A General Framework for Interacting Bayes-Optimally with Self-Interested Agents using Arbitrary Parametric Model and Model Prior
Authors
Abstract
Recent advances in Bayesian reinforcement learning (BRL) have shown that Bayes-optimality is theoretically achievable by modeling the environment’s latent dynamics using a Flat-Dirichlet-Multinomial (FDM) prior. In self-interested multiagent environments, the transition dynamics are mainly controlled by the other agent’s stochastic behavior, for which FDM’s independence and modeling assumptions do not hold. As a result, FDM allows the other agent’s behavior neither to be generalized across different states nor to be specified using prior domain knowledge. To overcome these practical limitations of FDM, we propose a generalization of BRL that integrates the general class of parametric models and model priors, thus allowing practitioners’ domain knowledge to be exploited to produce a fine-grained and compact representation of the other agent’s behavior. Empirical evaluation shows that our approach outperforms existing multi-agent reinforcement learning algorithms.
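The generalization issue described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the class names, the two-action setup, and the single shared Beta-Bernoulli parameter are all illustrative assumptions, chosen as the simplest possible parametric model. The FDM-style model keeps independent Dirichlet counts per state, so observations in one state say nothing about another, whereas the shared-parameter model carries what it learned to unseen states.

```python
from collections import defaultdict


class FDMOpponentModel:
    """Illustrative FDM-style model: independent Dirichlet counts per state."""

    def __init__(self, n_actions, prior_count=1.0):
        self.n_actions = n_actions
        # Each state gets its own flat Dirichlet pseudo-counts.
        self.counts = defaultdict(lambda: [prior_count] * n_actions)

    def update(self, state, action):
        # Only the counts of the observed state change.
        self.counts[state][action] += 1.0

    def predict(self, state):
        # Posterior mean of the other agent's action distribution in `state`.
        c = self.counts[state]
        total = sum(c)
        return [x / total for x in c]


class SharedBetaOpponentModel:
    """Hypothetical parametric model: one parameter theta = P(action 1),
    shared by all states, with a Beta(a, b) prior (an assumption made here)."""

    def __init__(self, a=1.0, b=1.0):
        self.a = a
        self.b = b

    def update(self, state, action):
        # The state is ignored: every observation informs the same parameter.
        if action == 1:
            self.a += 1.0
        else:
            self.b += 1.0

    def predict(self, state):
        p1 = self.a / (self.a + self.b)  # posterior mean of theta
        return [1.0 - p1, p1]


# Observe the other agent choosing action 1 eight times in state "s0".
fdm = FDMOpponentModel(n_actions=2)
shared = SharedBetaOpponentModel()
for _ in range(8):
    fdm.update("s0", 1)
    shared.update("s0", 1)

fdm_seen = fdm.predict("s0")        # informed by the eight observations
fdm_unseen = fdm.predict("s1")      # still the flat prior
shared_unseen = shared.predict("s1")  # reuses what was learned in "s0"
print(fdm_seen, fdm_unseen, shared_unseen)
```

In this toy setting the FDM-style prediction for the unvisited state stays at the uniform prior, while the shared-parameter model predicts action 1 with high probability there. Whether such sharing is appropriate depends on domain knowledge about the other agent, which is the kind of knowledge the abstract says a parametric model and prior can encode.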
Similar resources
Parametric Empirical Bayes Test and Its Application to Selection of Wavelet Threshold
In this article, we propose a new method for selecting level-dependent thresholds in wavelet shrinkage using the empirical Bayes framework. We employ both Bayesian and frequentist hypothesis testing instead of point estimation. The best test yields the best prior and hence the most appropriate wavelet thresholds. The standard model functions are used to illustrate the performance of the p...
Bayes, E-Bayes and Robust Bayes Premium Estimation and Prediction under the Squared Log Error Loss Function
In risk analysis based on the Bayesian framework, premium calculation requires specification of a prior distribution for the risk parameter in a heterogeneous portfolio. When prior knowledge is vague, E-Bayesian and robust Bayesian analysis can be used to handle the uncertainty in specifying the prior distribution by considering a class of priors instead of a single prior. In th...
Comparison of Estimates Using Record Statistics from Lomax Model: Bayesian and Non Bayesian Approaches
This paper addresses the problem of Bayesian estimation of the parameters, reliability and hazard functions in the context of record statistics from the two-parameter Lomax distribution. The ML and Bayes estimates based on records are derived for the two unknown parameters and the survival time parameters, reliability and hazard functions. The Bayes estimates are obtained based on conju...
Introducing of Dirichlet process prior in the Nonparametric Bayesian models frame work
Statistical models are used to learn about the mechanism that generates the data. It is often assumed that the random variables y_i, i=1,…,n, are samples from a probability distribution F that belongs to a parametric class of distributions. In practice, however, a parametric model may be inappropriate to describe the data. In this setting, the parametric assumption could be r...
Semiparametrics, Nonparametrics and Empirical Bayes Procedures in Linear Models
In a classical parametric setup, a key factor in the implementation of the Empirical Bayes methodology is the incorporation of a suitable prior that is compatible with the parametric setup and yet lends itself to the estimation of the Bayes (shrinkage) factor in an empirical manner. The situation is more complex in semi-parametric and (even more in) nonparametric models. Although the Dirichlet prior...